boundary point
A Mean Curvature Approach to Boundary Detection: Geometric Insights for Unsupervised Learning
Accurate boundary detection in high-dimensional data remains a central challenge in unsupervised learning, particularly in the presence of non-linear structures and heterogeneous densities. In this work, we introduce Mean Curvature Boundary Points (MCBP), a novel geometric framework grounded in Geometric Machine Learning that departs from traditional density-based approaches by explicitly modeling the intrinsic curvature of the data manifold. The method relies on a discrete approximation of the shape operator, estimated from local k-nearest neighbor patches, to compute pointwise mean curvature without requiring explicit manifold parametrization. The key insight of MCBP is to use mean curvature as a principled descriptor of boundary structure: high-curvature regions naturally correspond to transitions between clusters, geometric irregularities, and low-density interfaces. This yields a unified geometric interpretation of boundary, outlier, and transition points. We further introduce an adaptive percentile-based thresholding scheme that enables multiscale boundary extraction without relying on ad hoc density parameters. Beyond detection, we propose a curvature-driven data decomposition that separates samples into smooth (low-curvature) and boundary (high-curvature) subsets, effectively acting as a non-linear geometric filtering mechanism. This representation enhances cluster separability and improves the robustness of downstream unsupervised algorithms. Extensive experiments on synthetic and real-world datasets demonstrate that MCBP consistently improves clustering performance, particularly in complex and high-dimensional scenarios. These results position MCBP as a concrete contribution to Geometric Machine Learning, highlighting the potential of curvature-aware analysis as a unifying paradigm bridging differential geometry and data-driven modeling.
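The curvature-then-threshold idea in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes the data lie near a hypersurface (codimension 1), approximates the normal by the smallest-variance PCA direction of each k-nearest-neighbor patch, and fits a local quadratic height function whose Hessian trace stands in for the shape operator. The function names `mean_curvature_scores` and `high_curvature_mask`, and the parameters `k` and `q`, are hypothetical.

```python
import numpy as np

def mean_curvature_scores(X, k=10):
    """Rough per-point |mean curvature| from k-NN patches (assumes codimension 1)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    m = d - 1                                  # assumed tangent dimension
    # Brute-force pairwise squared distances; fine for small n only.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nbrs = np.argsort(sq, axis=1)[:, 1:k + 1]  # drop the point itself
    iu, ju = np.triu_indices(m)
    scores = np.empty(n)
    for i in range(n):
        P = X[nbrs[i]] - X[i]                  # patch centered at the query point
        # PCA of the patch: leading directions ~ tangent, last one ~ normal.
        _, _, Vt = np.linalg.svd(P - P.mean(axis=0), full_matrices=False)
        tangent, normal = Vt[:m], Vt[m]
        u = P @ tangent.T                      # tangent coordinates, shape (k, m)
        h = P @ normal                         # heights along the normal, shape (k,)
        # Least-squares fit: h ~ sum_{a<=b} c_ab u_a u_b + g.u + c0
        A = np.hstack([u[:, iu] * u[:, ju], u, np.ones((k, 1))])
        coef, *_ = np.linalg.lstsq(A, h, rcond=None)
        diag = coef[:len(iu)][iu == ju]        # c_aa; Hessian diagonal is 2 * c_aa
        scores[i] = abs(2.0 * diag.sum()) / m  # |trace(Hessian)| / m
    return scores

def high_curvature_mask(scores, q=90):
    """Percentile threshold: True marks candidate boundary (high-curvature) points."""
    return scores >= np.percentile(scores, q)

# Usage: points on a circle of radius 2 have curvature 1/2 everywhere.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = 2.0 * np.column_stack([np.cos(theta), np.sin(theta)])
circle_scores = mean_curvature_scores(circle, k=10)
```

In this sketch `k` must be at least the number of fitted coefficients, and the smooth/boundary split is simply the mask and its complement. Varying `q` gives a crude analogue of the multiscale extraction the abstract describes.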
A Statistical Framework for Spatial Boundary Estimation and Change Detection: Application to the Sahel-Sahara Climate Transition
Tivenan, Stephen, Sahoo, Indranil, Qian, Yanjun
Spatial boundaries, such as ecological transitions or climatic regime interfaces, capture steep environmental gradients, and shifts in their structure can signal emerging environmental changes. Quantifying uncertainty in spatial boundary locations and formally testing for temporal shifts remains challenging, especially when boundaries are derived from noisy, gridded environmental data. We present a unified framework that combines heteroskedastic Gaussian process (GP) regression with a scaled Maximum Absolute Difference (MAD) Global Envelope Test (GET) to estimate spatial boundary curves and assess whether they evolve over time. The heteroskedastic GP provides a flexible probabilistic reconstruction of boundary lines, capturing spatially varying mean structure and location-specific variability, while the test offers a rigorous hypothesis-testing tool for detecting departures from expected boundary behaviors. Simulation studies show that the proposed method achieves the correct size under the null and high power for detecting local boundary shifts. Applying our framework to the Sahel-Sahara transition zone, using annual Köppen-Trewartha climate classifications from 1960 to 1989, we find no statistically significant decade-scale changes in the arid/semi-arid or semi-arid/non-arid interfaces. However, the method successfully identifies localized boundary shifts during the extreme drought years of 1983 and 1984, consistent with climate studies documenting regional anomalies in these interfaces during that period.
Towards Personalized Treatment Plan: Geometrical Model-Agnostic Approach to Counterfactual Explanations
Sin, Daniel, Toutounchian, Milad
In this article, we describe a four-step method for generating counterfactual explanations in high-dimensional spaces: fitting a model to the dataset, finding the decision boundary, determining the constraints on the problem, and computing the closest point (the counterfactual explanation) on that boundary. We propose a discretized approach in which we find many discrete points on the boundary and then identify the closest feasible counterfactual explanation. This method, which we call $\textit{Segmented Sampling for Boundary Approximation}$ (SSBA), applies binary search to find decision boundary points and then searches for the closest boundary point. Across four datasets of varying dimensionality, we show that our method can outperform current methods for counterfactual generation, with reductions in distance of $5\%$ to $50\%$ in terms of the $L_2$ norm. Our method can also handle real-world constraints by restricting changes to immutable and categorical features, such as age, gender, sex, height, and other related characteristics, as in the case of a health-based dataset. In terms of runtime, the SSBA algorithm generates multiple orders of magnitude more decision boundary points than a grid-based approach in the same amount of time. In general, our method provides a simple and effective model-agnostic way to compute the nearest feasible (i.e., realistic under the constraints) counterfactual explanations. All of our results and code are available at: https://github.com/dsin85691/SSBA_For_Counterfactuals
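The binary-search step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the SSBA implementation from the linked repository: it bisects the straight segment between a query point and each point of the opposite predicted class, then keeps the boundary point closest to the query in the $L_2$ norm. The names `boundary_point`, `closest_counterfactual`, and `toy_predict` are hypothetical, and feasibility constraints on immutable or categorical features are omitted.

```python
import numpy as np

def boundary_point(predict, a, b, tol=1e-6):
    """Bisect segment a-b (endpoints with different labels) down to length tol."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    label_a = predict(a)
    if predict(b) == label_a:
        raise ValueError("endpoints must receive different predicted labels")
    while np.linalg.norm(b - a) > tol:
        mid = 0.5 * (a + b)
        if predict(mid) == label_a:
            a = mid          # boundary lies between mid and b
        else:
            b = mid          # boundary lies between a and mid
    return 0.5 * (a + b)

def closest_counterfactual(predict, x, opposite_points, tol=1e-6):
    """Approximate boundary points along each segment; return the one nearest to x."""
    x = np.asarray(x, dtype=float)
    candidates = [boundary_point(predict, x, p, tol) for p in opposite_points]
    dists = [np.linalg.norm(c - x) for c in candidates]
    return candidates[int(np.argmin(dists))]

# Toy stand-in for a fitted model (assumption): label 1 when x0 + x1 > 1.
def toy_predict(p):
    return int(p[0] + p[1] > 1.0)

query = np.array([0.0, 0.0])
others = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]])
cf = closest_counterfactual(toy_predict, query, others)
```

Each bisection halves the segment, so reaching tolerance `tol` costs about log2(length / tol) model calls per segment. That gives one plausible reading of why a search of this kind can produce far more boundary points than a grid in the same time.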
GCAO: Group-driven Clustering via Gravitational Attraction and Optimization
Traditional clustering algorithms often struggle with high-dimensional and non-uniformly distributed data, where low-density boundary samples are easily disturbed by neighboring clusters, leading to unstable and distorted clustering results. To address this issue, we propose a Group-driven Clustering via Gravitational Attraction and Optimization (GCAO) algorithm. GCAO introduces a group-level optimization mechanism that aggregates low-density boundary points into collaboratively moving groups, replacing the traditional point-based contraction process. By combining local density estimation with neighborhood topology, GCAO constructs effective gravitational interactions between groups and their surroundings, enhancing boundary clarity and structural consistency. Using groups as basic motion units, a gravitational contraction strategy ensures globally stable and directionally consistent convergence. Experiments on multiple high-dimensional datasets demonstrate that GCAO outperforms 11 representative clustering methods, achieving average improvements of 37.13%, 52.08%, 44.98%, and 38.81% in NMI, ARI, Homogeneity, and ACC, respectively, while maintaining competitive efficiency and scalability. These results highlight GCAO's superiority in preserving cluster integrity, enhancing boundary separability, and ensuring robust performance on complex data distributions.
Interpret Policies in Deep Reinforcement Learning using SILVER with RL-Guided Labeling: A Model-level Approach to High-dimensional and Multi-action Environments
Qian, Yiyu, Nguyen, Su, Chen, Chao, Zhou, Qinyue, Zhao, Liyuan
Deep reinforcement learning (RL) achieves remarkable performance but lacks interpretability, limiting trust in policy behavior. The existing SILVER framework (Li, Siddique, and Cao 2025) explains RL policies via Shapley-based regression but remains restricted to low-dimensional, binary-action domains. We propose SILVER with RL-guided labeling, an enhanced variant that extends SILVER to multi-action and high-dimensional environments by incorporating the RL policy's own action outputs into the identification of boundary points. Our method first extracts compact feature representations from image observations, performs SHAP-based feature attribution, and then employs RL-guided labeling to generate behaviorally consistent boundary datasets. Surrogate models, such as decision trees and regression-based functions, are subsequently trained to interpret the RL policy's decision structure. We evaluate the proposed framework on two Atari environments using three deep RL algorithms and conduct a human-subject study to assess the clarity and trustworthiness of the derived interpretable policy. Results show that our approach maintains competitive task performance while substantially improving transparency and human understanding of agent behavior.
MMGP_supplementary_material
Details regarding the datasets are provided in Appendix A. Morphing strategies and dimensionality. Regarding the AirfRANS dataset, the reader is referred to [14]. Examples of input geometries are shown in Figure 6 together with the associated output pressure fields. The output scalars of the problem are obtained by post-processing the three-dimensional velocity. Examples of input geometries are shown in Figure 7. Figure 8: (Tensile2d) Illustration of Tutte's barycentric mapping used in the morphing stage. Notice that although these morphing techniques are called "mesh ... A zoom of the RBF morphing close to the airfoil for test sample 787 is illustrated in Figure 10.